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Abstract 

We present a method to solve initial and boundary value problems using artificial neural 
networks. A trial solution of the differential equation is written as a sum of two parts. The first 
part satisfies the initial/boundary conditions and contains no adjustable parameters. The second 
part is constructed so as not to affect the initial/boundary conditions. This part involves a feed- 
forward neural network, containing adjustable parameters (the weights). Hence by construction 
the initial/boundary conditions are satisfied and the network is trained to satisfy the differential 
equation. The applicability of this approach ranges from single ODE's, to systems of coupled 
ODE's and also to PDE's. In this article we illustrate the method by solving a variety model 
problems and present comparisons with finite elements for several cases of partial differential 
equations. 
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1 Introduction 



Many methods have been developed so far for solving differential equations. Some of them produce a 
solution in the form of an array that contains the value of the solution at a selected group of points. 
Others use basis-functions to represent the solution in analytic form and transform the original 
problem usually in a system of linear equations. Most of the previous work in solving differential 
equations using neural networks is restricted to the case of solving the linear systems of algebraic 
equations which result from the discretization of the domain. The solution of a linear system of 
equations is mapped onto the architecture of a Hopfield neural network. The minimization of the 
network's energy function provides the solution to the system of equations P] . 

Another approach to the solution of ordinary differential equations is based on the fact that 
certain types of splines, for instance linear B-splines, can be derived by the superposition of piecewise 
linear activation functions [§, The solution of a differential equation using linear B-splines as 
basis functions, can be obtained by solving a system of linear or non-linear equations in order to 
determine the parameters of the splines. Such a solution form is mappped directly on the architecture 
of a feedforward neural network by replacing each spline with the sum of piecewise linear activation 
functions that correspond to the hidden units. This method considers local basis-functions and 
in general requires many splines (and consequently network parameters) in order to yield accurate 
solutions. Furthermore it is not easy to extend these techniques to multidimensional domains. 

In this article we view the problem from a different angle. We present a general method for 
solving both ordinary differential equations (ODEs) and partial differential equations (PDEs), that 
relies on the function approximation capabilities of feedforward neural networks and results in the 
construction of a solution written in a diferentiable, closed analytic form. This form employs a 
feedforward neural network as the basic approximation element, whose parameters (weights and 
biases) are adjusted to minimize an appropriate error function. To train the network we employ 
optimization techniques, which in turn require the computation of the gradient of the error with 
respect to the network parameters. In the proposed approach the model function is expressed as the 
sum of two terms: the first term satisfies the initial/boundary conditions and contains no adjustable 
parameters. The second term involves a feedforward neural network to be trained so as to satisfy 
the differential equation. Since it is known that a multilayer perceptron with one hidden layer can 
approximate any function to arbitrary accuracy, it is reasonable to consider this type of network 
architecture as a candidate model for treating differential equations. 

The employement of a neural architecture adds to the method many attractive features: 

• The solution via ANN's is a dijjerentiable, closed analytic form easily used in any subsequent 
calculation. Most other techniques offer a discrete solution (for example predictor-corrector, or 
Runge-Kutta methods) or a solution of limited differentiability (for example finite elements). 
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• Such a solution is characterized by the generahzation properties of neural networks, which are 
known to be superior. (Comparative results presented in this work illustrate this point clearly.) 

• The required number of model parameters is far less than any other solution technique and 
therefore, compact solution models are obtained, with very low demand on memory space. 

• The method is general and can be applied to ODEs, systems of ODEs and to PDEs as well. 

• The method can be realized in hardware, using neuroprocessors, and hence offer the opportu- 
nity to tackle in real time difficult differential equation problems arising in many engineering 
applications. 

• The method can also be efficiently implemented on parallel architectures. 

In the next section we describe the general formulation of the proposed approach and derive 
formulas for computing the gradient of the error function. Section 3 illustrates some classes of 
problems where the proposed method can be applied and describes the appropriate form of the trial 
solution. Section 4 presents numerical examples from the application of the technique to several test 
problems and provides details concerning the implementation of the method and the accuracy of the 
obtained solution. We also make a comparison of our results with those obtained by the finite element 
method for the examined PDE problems. Finally, section 6 contains conclusions and directions for 
future research. 

2 Description of the method 

The proposed approach will be illustrated in terms of the following general differential equation 
definition: 



subject to certain boundary conditions (B.Cs) (for instance Dirichlet and/or Neumann), where x = 
{xi, . . . ,Xn) € -R", D C denotes the definition domain and "^{x) is the solution to be computed. 
The proposed approach can be also applied to differential equations of higher order, but we have not 
considered any problems of this kind in the present work. 

To obtain a solution to the above differential equation the collocation method is adopted which 
assumes a discretization of the domain D and its boundary S into a set points D and S respectively. 
The problem is then transformed into the following system of equations: 



G{x, ^(x), V^(f), V^^(f)) =0,xeD 



(1) 



G{x,,^{xi),V^{xi),V^^{xi)) = 0,Vxl G D 



(2) 



subject to the constraints imposed by the B.Cs. 
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If p) denotes a trial solution with adjustable parameters p, the problem is transformed to: 

mirij^Yl G{xi,<i/t{xi,p),V<i/{xi,p),V^<i/{xi,p)f (3) 

XieD 

subject to the constraints imposed by the B.Cs. 

In the proposed approach the trial solution employs a feedforward neural network and the 
parameters p correspond to the weights and biases of the neural architecture. We choose a form for 
the trial function \E'((x) such that by construction satisfies the BCs. This is achieved by writing it 
as a sum of two terms: 

^^(f) = A{x) + F{x, N{x,p)) (4) 

where N{x,p) is a single-output feedforward neural network with parameters p and n input units fed 
with the input vector x. 

The term A{x) contains no adjustable parameters and satisfies the boundary conditions. The 
second term F is constructed so as not to contribute to the BCs, since ^t(a^) must also satisfy them. 
This term employs a neural network whose weights and biases are to be adjusted in order to deal 
with the minimization problem. Note at this point that the problem has been reduced from the 
original constrained optimization problem to an unconstrained one (which is much easier to handle) 
due to the choice of the form of the trial solution that satisfies by construction the B.Cs. 

In the next section we present a systematic way to construct the trial solution, i.e. the functional 
forms of both A and F. We treat several common cases that one frequently encounters in various 
scientific fields. As indicated by our experiments, the approach based on the above formulation is very 
effective and provides in reasonable computing time accurate solutions with impressive generalization 
(interpolation) properties. 

2.1 Gradient Computation 

The efficient minimization of equation (3) can be considered as a procedure of training the neural 
network where the error corresponding to each input vector Xi is the value G{xi) which has to 
become zero. Computation of this error value involves not only the network output (as is the case 
in conventional training) but also the derivatives of the output with respect to any of its inputs. 
Therefore, in computing the gradient of the error with respect to the network weights, we need to 
compute not only the gradient of the network but also the gradient of the network derivatives with 
respect to its inputs. 

Consider a multilayer perceptron with n input units, one hidden layer with H sigmoid units and 
a linear output unit. The extension to the case of more than one hidden layers can be obtained 
accordingly. For a given input vector x — {xi, . . . , x„) the output of the network is N — 'Vicrizi) 
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where Zi = J2j=iWijXj + Ui, Wij denotes the weight from the input unit j to the hidden unit i, Vi 
denotes the weight from the hidden unit i to the output, Ui denotes the bias of hidden unit i and 
a{z) is the sigmoid transfer function. It is straightforward to show that: 



dxj 



H 



where CTj = trizi) and a^'^^ denotes the k^^ order derivative of the sigmoid. Moreover it is readily 
verifiable that: 

^ ^ N = Y.v.P.a^ (6) 



where 



dx^' 8x2" ' ' ' dx^ 

n 

Pr=X{ ^'k' (7) 



k=l 

and A = Er=i Ai. 

Equation (6) indicates that the derivative of the network with respect to any of its inputs is 
equivalent to a feedforward neural network Ng{x) with one hidden layer, having the same values for 
the weights Wij and thresholds Ui and with each weight Vi being replaced with ViPi. Moreover the 
transfer function of each hidden unit is replaced with the A*^ order derivative of the sigmoid. 

Therefore the gradient of A^^^ with respect to the parameters of the original network can be easily 
obtained as: 

P.at^ (8) 



dui 

dN, 



dvi 

v.P.ar" (9) 



p^(A+l) 



x.v^P^at'-'^+vA.w^f'i n ^^)^^ (10) 

Once the derivative of the error with respect to the network parameters has been defined it is 
then straightforward to employ almost any minimization technique. For example it is possible to 
use either the steepest descent (i.e. the backpropagation algorithm or any of its variants), or the 
conjugate gradient method or other techniques proposed in the literature. In our experiments we 
have employed the BFGS method P] that is quadraticly convergent and has demonstrated excellent 
performance. It must also be noted that for a given grid point the derivatives of each network (or 
gradient network) with respect to the parameters may be obtained simultaneously in the case where 
parallel harware is available. Moreover, in the case of backpropagation, the on-line or batch mode of 
weight updates may be employed. 
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3 Illustration of the method 



3.1 Solution of single ODEs and Systems of coupled ODEs 

To illustrate the method, we consider the first order ODE: 



dx 

with X e [0, 1] and with the IC ^(0) = A. 
A trial solution is written as: 

-^t{x) = A + xN{x, p) (11) 

where N{x, p) is the output of a feedforward neural network with one input unit for x and weights 
p. Note that ^((x) satisfies the IC by construction. The error quantity to be minimized is given by: 

E[p\=j:{^^-f{x.,ux.))r (12) 

i 

where the Xj's are points in [0, 1]. Since d^t{x)/dx — N{x,p) + xdN{x,p)/dx, it is straightforward 
to compute the gradient of the error with respect to the parameters p using equations (5)-(10). The 
same holds for all subsequent model problems. 

The same procedure can be applied to the second order ODE: 

For the initial value problem: ^(0) = A and ^^(0) = A', the trial solution can be cast as: 

^t{x)^A + A'x + x^N{x,p} (13) 

For the two point Dirichlet BC: ^(0) = A and ^'(l) = B, the trial solution is written as: 

^t{x)^A{l-x) + Bx + x{l-x)N{x,p) (14) 

In the above two cases of second order ODEs the error function to be minimized is given by equation 
(12). 

For systems of K first order ODEs 

d'if 

^ = /,(x,*i,*2,...*i.) 

with ^i{0) — Ai, {i — 1,. . . ,K) we consider one neural network for each trial solution {i — 
1, . . . ,K) which is written as: 

^t,ix)^Ai + xNi{x,pi) (15) 
and we minimize the following error quantity: 

m = E E{%r^ - M^i^ ^t.,^t.,---, %.)r (16) 

k=i i 
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3.2 Solution of single PDEs 

We treat here two-dimensional problems only. However it is straightforward to extend the method 
to more dimensions. For example consider the Poisson equation: 

;^{x,y) + —^{x,y) ^ f{x,y) 



X e [0,1], y G [0,1] with Dinchlet BC: ^{0,y) = fo{y), ^{l,y) = fi{y) and ^{x,0) = go{x), 
^(x, 1) = gi{x). The trial solution is written as: 

^t{x, y) = A{x, y) + x{l - x)y{l - y)N{x, y,p) (17) 

where A{x, y) is chosen so as to satisfy the BC, namely: 

A{x,y) = {l-x)fo{y)+xh{y) + {l-y){go{x)-[{l-x)go{Q)+xgo{l)]}+^^^ 

(18) 

For mixed boundary conditions of the form: ^(0, y) = fo{y), \I/(1, y) = f\{y), \E'(x, 0) = goix) and 
^\E'(x, 1) = gi{x) (i.e. Dirichleton part of the boundary and A''e?imann elsewhere) , the trial solution 
is written as: 

*t(x,y) = B{x,y)+x{l - x)y[N{x,y,p} - N{x, - ^N{x, l,p)] (19) 

and B{x, y) is again chosen so as to satisfy the BCs: 

B{x,y) = (l-a;)/o(2/)+a;/i(|/)+5o(a;)-[(l-a;)5o(0)+a;5o(l)]+2/{5i(a;)-[(l-%i(0)+a;5i(l)]} (20) 

Note that the second term of the trial solution does not affect the boundary conditions since it 
vanishes at the part of the boundary where Dirichlet BCs are imposed and its gradient component 
normal to the boundary vanishes at the part of the boundary where Neumann BCs are imposed. 

In all the above PDE problems the error to be minimized is given by: 

E\p\ = E{^*(^- Vi) + - /(^- yi)y (21) 

where {xi,yi) are points in [0, 1] x [0, 1]. 



4 Examples 

In this section we report on the solution of a number of model problems. In all cases wc used a 
multilayer perceptron having one hidden layer with 10 hidden units and one linear output unit. The 
sigmoid activation of each hidden unit is a{x) = jr^^- For each test problem the exact analytic 
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solution \l/a(a;) was known in advance. Therefore we test the accuracy of the obtained solutions by 
computing the deviation A\E'(x) = \E't(x) — \E'a(x). To perform the error minimization we employed the 
Merlin Q optimization package. Merlin provides an environment for multidimensional continuous 
function optimization. From the several algorithms that are implemented therein, the Quasi-Newton 
BFGS method seemed to perform better in these kind of problems and hence we used it in all of 
our experiments. A simple criterion for the gradient norm was used for termination. 

In order to illustrate the characteristics of the solutions provided by the neural method, we provide 
figures displaying the corresponding deviation A\E'(x) both at the few points (training points) that 
were used for training and at many other points (test points) of the domain of each equation. The 
second kind of figures are of major importance since they show the interpolation capabilities of the 
neural solutions which seem to be superior compared to other solutions. Moreover, in the case of 
ODEs we also consider points outside the training interval in order to obtain an estimate of the 
extrapolation performance of the obtained solution. 

4.1 ODEs and systems of ODEs 
4.1.1 Problem 1 



+ (x H r)^ = x-^ + 2x + x 



dx 1 + X + x^ 1 + X + x^ 

with ^1^(0) = 1 and x G [0,1]. The analytic solution is ^aix) = i_^_^_^^3 + x'^ and is displayed 
in Figure la. According to equation (11) the trial neural form of the solution is taken to be: 
\E't(x) = l + xN{x,p). The network was trained using a grid of 10 equidistant points in [0,1]. Figure 2 
displays the deviation A\E'(x) from the exact solution corresponding at the grid points (small circles) 
and the deviation at many other points in [0, 1] as well as outside that interval (dashed line). It is 
clear that the solution is of high accuracy, although training was performed using a small number of 
points. Moreover, the extrapolation error remains low for points near the equation domain. 

4.1.2 Problem 2 

d ^ 1 T- _i / N 

— * + = e ^cos(x) 
dx 5 

with \E'(0) = and x G [0, 2]. The analytic solution is ^a{x) = sin{x) and is presented in Figure 
lb. The trial neural form is: '^t{x) = xN{x,p) according to equation (11). As before we used a grid 
of 10 equidistant points in [0,2] to perform the training. In analogy with the previous case. Figure 3 
display the deviation A\E'(x) at the grid points (small circles) and at many other points inside and 
outside the training interval (dashed line). 
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4.1.3 Problem 3 

^ I d ^ ^ 1 _x 

\^ H — \l/ + * = — e 5 cosx 

dx^ 5 dx 5 

Consider the initial value problem: \E'(0) = and ^\t'(0) = 1 with x E [0,2]. The exact solution is: 

\E'(x) = e^^sin(x) and the trial neural form is: \E't(a;) = x + x'^N{x,p) (from equation (13)). 

Consider also the boundary value problem: ^(0) = and ^E'(l) = sin{l)e^^>, x E [0, 1]. The exact 

solution is the same as above, but the appropriate trial neural form is: ^t{x) = xsin{l)e~5 + x{l — 

x)N{x,p) (from equation (14)). 

Again as before we used a grid of 10 equidistant points and the plots of the deviation from the 
exact solution are displayed at Figures 4 and 5 for the initial value and boundary value problem 
respectively. The interpretation of the figures is the same as in the previous cases. 

From all the above cases it is clear that method can handle effectively all kinds of ODEs and 
provides analytic solutions that remain to be of the same accuracy at points other from the training 
ones. 

4.1.4 Problem 4 

Consider the system of two coupled first order ODEs: 

-^^1 = cos{x) + + ^'2 - (1 + a;^ + sin^{x)) 
dx 

-^^'2 = 2x- {1 + x'^)sin{x) + ^'i^'2 
dx 

with X E [0,3] and ^i(O) = and ^2(0) = 1. The analytic solutions are ^ai{x) — sin{x) and 
^a2{x) — 1 + x"^ and are displayed at Figure 6a and 6b, respectively. Following equation (15) the 
trial neural solutions are: ^ti(x) = xNi{x,pi) and ^tjl^) = 1 + xN2{x,p2) where the networks A^i 
and N2 have the same architecture as in the previous cases. Results concerning the accuracy of 
the obtained solutions at the grid points (small circles) and at many other points (dashed line) are 
presented in Figure 7. 

4.2 PDEs 

We consider boundary value problems with Dirichlet and Neumann BCs. All subsequent problems 
were defined on the domain [0, 1] x [0, 1] and in order to perform training we consider a mesh of 
100 points obtained by considering 10 equidistant points of the domain [0, 1] of each variable. In 
analogy with the previous cases the neural architecture was considered to be a MLP with two inputs 
(accepting the coordinates x and y of each point), 10 sigmoid hidden units and one linear output 
unit. 
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4.2.1 Problem 5 

V^*(x, y) = e'^'ix -2 + y^ + 6y) 

with x,y e [0,1] and the Dirichlet BCs: ^(0,|/) = y^, \I/(1,1/) = (1 + y^)e~^ and ^(x, 0) = xe"^, 
1) = e~'^(a; + 1). The analytic solution is ?/) = e~^(a; + y^) and is displayed in Figure 

8. Using equation (17) the trial neural form must be written: \i/f(a;, ?/) = A{x,y) + x{l — x)y{l — 
y)N{x, y,p) and A{x, y) is obtained by direct substitution in the general form given by equation (18): 

A{x, y) = (1 - x)y^ + x{l + y^)e~^ + (1 - ?/)^(e~'' - e"^) + y[{l + x)e"'^ - (1 - x - 2x6"^)] 

Figure 9 presents the deviation ^^{x,y) of the obtained solution at the 100 grid points that were 
selected for training while Figure 10 displays the deviation at 900 other points of the equation domain. 
It clear that the solution is very accurate and the accuracy remains high at all points of the domain. 

4.2.2 Problem 6 

V^^'(x, y) = e-'^{[-^a^x - ^ + 2a2]cos(aV + y) + [^^ - 1 - 4aV + ^]sm(aV + y)} 

5 5 z5 z5 

with a = 3, y e [0,1] and the Dirichlet BCs as defined by the exact solution ^^(a;, y) = 
e ^sin(a'^x'^ + y) (presented in Figure 11). Again the trial neural form is: '^t{x^y) = A{x,y) + 
x{l — x)y{l — y)N{x,y,p) and A{x,y) is obtained similarly by direct substitution in equation (18). 
Accuracy results are presented in Figure 12 for the training points and in Figure 13 for test points. 
It can be shown that the accuracy is not the same as in the previous example, but it can be improved 
further by considering a neural network with more than 10 hidden units. From the figures it is also 
clear that the test error lies in the same range as the training error. 

4.2.3 Problem 7 

V^'^{x,y) = (2 - n^y^)sin{7!-x) 

with x,y e [0,1] and with mixed BCs: ^(0,|/) = 0, ^(1,?/) = and *(a;,0) = 0, ^^(x, 1) = 
2sin{7ix). The analytic solution is ^a(a^, y) = y'^sin{7rx) and is presented in Figure 14. The trial 
neural form is specified according to equation (19) 

d 

"^tix, y) = B{x, y) + x{l - x)y[N{x, y,p) - N{x, l,p) - -^-Nix, l,p)] 

dy 

where B{x^ y) is obtained by direct substitution in equation (20). The accuracy of the neural solution 
is depicted in Figures 15 and 16 for training and test points respectively. 
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Figure 1: Exact solutions of ODE problems 



4.2.4 Problem 8 

This is an example of a non-linear PDE. 

d 

V^^(a;, y) + ^(x, y)—'^{x, y) = sin{-nx){2 - n'^y'^ + 2y^sin{nx)) 

with the same mixed BCs as in the previous problem. The exact solution is again \E'a(x, = 
y'^sin{TTx) and the parametrization of the trial neural form is the same as in problem 7. No plots of 
the accuracy are presented since they are almost the same with those of problem 7. 



4.3 Comparison with Finite Elements 

The above PDE problems were also solved with the finite element method which has been widely 
acknowledged as one of the most effective approaches to the solution of differential equations. The 
characteristics of the finite element method employed in this work are briefly summarized below. In 
the finite element approach the unknowns are expanded in piecewise continuous biquadratic elements 

ig: 

<il = Y.^M^,n) (22) 

i=l 

where $j is the biquadratic basis function and is the unknown at the i*'^ node of the element. 
The physical domain [x, y) is mapped on the computational domain n) through the isoparametric 
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Figure 2: Problem 1: Accuracy of the computed solution. 



Figure 3: Problem 2: Accuracy of the computed solution. 
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Figure 4: Problem 3 with initial conditions: Accuracy of the computed solution. 
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Figure 5: Problem 3 with boundary conditions: Accuracy of the computed solution. 
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Figure 6: Exact solutions of the system of coupled ODEs. 
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Figure 7: Problem 4: Accuracy of the computed solutions. 
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Figure 8: Exact solution of PDE problem 5. 





Figure 10: Problem 5: Accuracy of the computed solution at the test points. 
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Figure 12: Problem 6: Accuracy of the computed solution at the training points. 




Figure 13: Problem 6: Accuracy of the computed solution at the test points. 
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Figure 14: Exact solution of PDE problems 7 and 8. 




Figure 15: Problem 7: Accuracy of the computed solution at the training points. 
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Figure 16: Problem 7: Accuracy of the computed solution at the test points. 

mapping: 

x^f^^MCn) (23) 

i=l 

y = j2y^M^,n) (24) 
1=1 

where ^ and n are the local coordinates in the computational domain (0 < < 1) and Xi, yi the 
i*'* node coordinates in the physical domain for the mapped element. 

The used Galerkin Finite Element Method (GFEM) calls for the weighted residuals Ri to vanish 
at each nodal position i: 

Ri= [ G{x,y)det{J)d^dn^O (25) 

JD 

where G is given by equation (1) and J is the Jacobian of the isoparametric mapping: 

, ./ dx dy dx dy 

This requirement along with the imposed boundary conditions constitute a set of nonlinear algebraic 
equations {Ri — 0). The inner products involved in the finite element formulation are computed using 
the nine-node Gaussian quadrature. The system of equations is solved for the nodal coefficients of the 
basis function expansion using the Newton's method forming the Jacobian of the system explicitly 
(for both linear and nonlinear differential operators): 

5A^("+i) = -R (27) 



19 





Neural Method 


Finite Element 


Probleiii No. 


Training set 


Iiiteri)olatioii set 


Training set 


Interi)olatioii set 


5 


5 X 10-^ 


5 X 10-^ 


2 X 10-8 


1.5 X 10-^ 


6 


0.0015 


0.0015 


0.0002 


0.0025 


7 


6 X 10-^ 


6 X 10"^ 


7 X 10-^ 


4 X 10-5 


8 


1.5 X 10~^ 


1.5 X 10"^ 


6 X 10-^ 


4 X 10-5 



Table 1: Maximum deviation from the exact solution for the neural and the finite element methods. 

^(n+l) ^ ^(n) _^ (^28) 

where the superscript n denotes the iteration number and B is the global Jacobian of the system of 
equations R: 

The initial guess is chosen at random. For linear problems convergence is achieved in one 
iteration and for non-linear problems in 1-5 iterations. 

All PDE problems 5-8 are solved on a rectangular domain of 18 x 18 elements resulting in a linear 
system with 1369 unknowns. This is in contrast with the neural approach which assumes a small 
number of parameters (30 for ODEs and 40 for PDEs), but requires more sophisticated minimization 
algorithms. As the number of employed elements increases the finite element approach requires 
an excessive number of parameters. This fact may lead to memory requirements that exceed the 
available memory resources. 

In the finite element case, interpolation is performed using a rectangular grid of 23 x 23 equidis- 
tant points (test points). For each pair of nodal coordinates {x,y) of this grid, we correspond a 
pair of local coordinates (^, n) of a certain element of the original grid where we have performed the 
computations. The interpolated values are computed as: 

^{^,n)=j2%m,n) (30) 

i=l 

for the element that corresponds to the global coordinates {x,y). It is clear that the solution is not 
expressed in closed analyical form as in the neural case, but additional computations are required in 
order to find the value of the solution at an arbitrary point in the domain. Figures 17-22 display the 
deviation \^{x, y) — ^a{x, y)\ for PDE problems 5-7 (figures concerning problem 8 are similar to those 
of problem 7). For each problem two figures are presented displaying the deviation at the training 
set and at the interpolation set of points respectively. Table 1 reports the maximum deviation 
corresponding to the neural and to the finite element method at the training and at the interpolation 
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Figure 17: Problem 5: Accuracy of the FEM solution at the training points. 

set of points. It is obvious that at the training points the solution of the finite element method is very 
satisfactory and in some cases it is better than that obtained using the neural method. It is also clear 
that the accuracy at the interpolation points is orders of magnitude lower as compared to that at 
the training points. On the contrary, the neural method provides solutions of excellent interpolation 
accuracy, since, as Table 1 indicates, the deviations at the training and at the interpolation points 
are comparable. It must also be stressed that the accuracy of the finite element method decreases as 
the size of the grid becomes smaller, and that the neural approach considers a mesh of 10x10 points 
while the in the finite element case a 18x18 mesh was employed. 

5 Conclusions and Future Research 

A method has been presented for solving differential equations that relies upon the function ap- 
proximation capabilities of the feedforward neural networks and provides accurate and diffcrcntiable 
solutions in a closed analytic form. The success of the method can be attributed to two factors. 
The first one is the employment of neural networks that are excellent function approximators and 
the second is the form of the trial solution that satisfies by construction the BCs and therefore the 
constrained optimization problem becomes a substantially simpler unconstrained one. 

Unlike most previous approaches, the method is general and can be applied to both ODEs and 
PDEs by constructing the appropriate form of the trial solution. As indicated by our experiments 
the method exhibits excellent generalization performance since the deviation at the test points was 
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Figure 18: Problem 5: Accuracy of the FEM solution at the test points. 
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Figure 19: Problem 6: Accuracy of the FEM solution at the training points. 
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Figure 20: Problem 6: Accuracy of the FEM solution at the test points. 




Figure 21: Problem 7: Accuracy of the FEM solution at the training points. 
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Figure 22: Problem 7: Accuracy of the FEM solution at the test points. 



in no case greater than the maximum deviation at the training points. This is in contrast with the 
finite element method where the deviation at the testing points was significantly greater than the 
deviation at the training points. 

We note that the neural architecture employed was fixed in all the experiments and we did not 
attempt to find optimal configurations or to study the effect of the number of hidden units on the 
performance of the method. Moreover, we did not consider architectures containing more than one 
hidden layers. A study of the effect of the neural architecture on the quahty of the solution constitutes 
one of our research objectives. 

Another issue that needs to be examined is related with the sampling of the grid points that are 
used for training. In the above experiments the grid was constructed in a simple way by considering 
equidistant points. It is expected that better results will be obtained in the case where the grid 
density will vary during training according to the corresponding error values. This means that it is 
possible to consider more training points at regions where the error values are higher. 

It must also be stressed that the proposed method can easily be used for deahng with domains 
of higher dimensions (three or more) . As the dimensionality increases, the number of training points 
becomes large. This fact becomes a serious problem for methods that consider local functions around 
each grid point since the required number of parameters becomes excessively large and, therefore, 
both memory and computation time requirements become intractable. In the case of the neural 
method the number of training parameters remains almost fixed as the problem dimensionality 
increases. The only effect on the computation time stems from the fact that each training pass 
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requires the presentation of more points, i.e. the training set becomes larger. This problem can be 
tackled by considering either parallel implementations, or implementations on a neuroprocessor that 
can be embedded in a conventional machine and provide considerably better execution times. Such 
an implementation on neural hardware is one of our near future objectives, since it will permit the 
treatment of many difficult real world problems. Finally we aim at extending the approach to treat 
other problems of similar nature, as for example eigenvalue problems for differential operators. 

One of us (I. E. L.) acknowledges partial support from the General Secretariat of Research and 
Technology under contract PENED 91 ED 959. 
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